NSF PAR Search | NSF Public Access Repository

Learning to Design Accurate Deep Learning Accelerators with Inaccurate Multipliers

https://doi.org/10.23919/DATE54114.2022.9774607

Jain, Paras; Huda, Safeen; Maas, Martin; Gonzalez, Joseph E.; Stoica, Ion; Mirhoseini, Azalia (March 2022, 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE))

Approximate computing is a promising way to improve the power efficiency of deep learning. While recent work proposes new arithmetic circuits (adders and multipliers) that consume substantially less power at the cost of computation errors, these approximate circuits decrease the end-to-end accuracy of common models. We present AutoApprox, a framework to automatically generate approximate low-power deep learning accelerators without any accuracy loss. AutoApprox generates a wide range of approximate ASIC accelerators with a TPUv3 systolic-array template. AutoApprox uses a learned router to assign each DNN layer to an approximate systolic array from a bank of arrays with varying approximation levels. By tailoring this routing for a specific neural network architecture, we discover circuit designs without the accuracy penalty from prior methods. Moreover, AutoApprox optimizes for the end-to-end performance, power and area of the whole chip and PE mapping rather than simply measuring the performance of the arithmetic units in isolation. To our knowledge, our work is the first to demonstrate the effectiveness of custom-tailored approximate circuits in delivering significant chip-level energy savings with zero accuracy loss on a large-scale dataset such as ImageNet. AutoApprox synthesizes a novel approximate accelerator based on the TPU that reduces end-to-end power consumption by 3.2% and area by 5.2% at a sub-10nm process with no degradation in ImageNet validation top-1 and top-5 accuracy.
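The per-layer routing idea in this abstract can be illustrated with a small sketch. This is not the AutoApprox router, which is learned; it is a greedy stand-in under stated assumptions. The `Array` and `Layer` types, the `error` and `sensitivity` numbers, and the accuracy `budget` are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Array:
    name: str
    rel_power: float  # power relative to the exact array (1.0 = exact); hypothetical
    error: float      # per-MAC error level (0.0 = exact); hypothetical


@dataclass(frozen=True)
class Layer:
    name: str
    macs: int           # multiply-accumulate operations in the layer
    sensitivity: float  # accuracy sensitivity to arithmetic error; hypothetical


def route(layers, bank, budget):
    """Greedy stand-in for a learned router (an assumption, not the paper's method).

    Visits layers from largest to smallest MAC count and gives each one the
    lowest-power array whose estimated accuracy cost still fits the budget.
    """
    # The bank must contain an exact array so that every layer has a fallback.
    assert any(a.error == 0.0 for a in bank), "bank needs an exact array"
    remaining = budget
    assignment = {}
    for layer in sorted(layers, key=lambda l: l.macs, reverse=True):
        affordable = [a for a in bank if layer.sensitivity * a.error <= remaining]
        best = min(affordable, key=lambda a: a.rel_power)
        remaining -= layer.sensitivity * best.error
        assignment[layer.name] = best
    return assignment


def relative_power(layers, assignment):
    """MAC-weighted power of the routed design relative to an all-exact design."""
    total_macs = sum(l.macs for l in layers)
    return sum(l.macs * assignment[l.name].rel_power for l in layers) / total_macs
```

In this toy model, error-sensitive layers stay on the exact array while large, tolerant layers move to cheaper arrays. A learned router would replace the fixed `sensitivity` estimates with a policy trained against measured accuracy.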
Full Text Available
Reinforcement Learning for Electronic Design Automation: Case Studies and Perspectives (Invited Paper)

https://doi.org/10.1109/ASP-DAC52403.2022.9712578

Budak, Ahmet F.; Jiang, Zixuan; Zhu, Keren; Mirhoseini, Azalia; Goldie, Anna; Pan, David Z. (January 2022, IEEE/ACM Asian and South Pacific Design Automation Conference (ASP-DAC))

Full Text Available
Deep Mixture of Experts via Shallow Embedding

Wang, Xin; Yu, Fisher; Dunlap, Lisa; Ma, Yi-An; Wang, Ruth; Mirhoseini, Azalia; Darrell, Trevor; Gonzalez, Joseph E. (July 2019, Uncertainty in Artificial Intelligence)

Larger networks generally have greater representational power at the cost of increased computational complexity. Sparsifying such networks has been an active area of research but has been generally limited to static regularization or dynamic approaches using reinforcement learning. We explore a mixture of experts (MoE) approach to deep dynamic routing, which activates certain experts in the network on a per-example basis. Our novel DeepMoE architecture increases the representational power of standard convolutional networks by adaptively sparsifying and recalibrating channel-wise features in each convolutional layer. We employ a multi-headed sparse gating network to determine the selection and scaling of channels for each input, leveraging exponential combinations of experts within a single convolutional network. Our proposed architecture is evaluated on four benchmark datasets and tasks, and we show that DeepMoEs are able to achieve higher accuracy with lower computation than standard convolutional networks.
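The channel selection and scaling described in this abstract can be sketched for a single layer as follows. This is a minimal illustration, not the DeepMoE implementation: it assumes a ReLU gate computed from a per-input embedding, and the function names and the use of `None` to mark a skipped channel are hypothetical.

```python
def relu(x):
    """Rectifier: negative pre-activations become exactly zero, which gives sparsity."""
    return x if x > 0.0 else 0.0


def gate_channels(embedding, gate_weights):
    """Compute one non-negative gate per channel from a per-input embedding.

    gate_weights holds one weight row per channel (hypothetical layout).
    """
    return [
        relu(sum(w * e for w, e in zip(row, embedding)))
        for row in gate_weights
    ]


def apply_gates(channels, gates):
    """Scale each channel's feature map by its gate.

    A zero gate means the channel is not selected for this input, so its
    feature map is skipped (marked None) instead of being computed and scaled.
    """
    out = []
    for feature_map, gate in zip(channels, gates):
        if gate == 0.0:
            out.append(None)
        else:
            out.append([gate * v for v in feature_map])
    return out
```

Because the gates depend on the input, different examples activate different subsets of channels, which is where the per-example computation saving would come from in this toy model.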
Full Text Available
